Cross-Lingual Sentiment Classification with Bilingual Document Representation Learning

Authors

  • Xinjie Zhou
  • Xiaojun Wan
  • Jianguo Xiao
Abstract

Cross-lingual sentiment classification aims to adapt sentiment resources from a resource-rich language to a resource-poor language. In this study, we propose a representation learning approach that simultaneously learns vector representations for texts in both the source and target languages. Unlike previous research, which learns only bilingual word embeddings, our Bilingual Document Representation Learning model (BiDRL) directly learns document representations. Both semantic and sentiment correlations are used to map the bilingual texts into the same embedding space. The experiments are based on the multilingual, multi-domain Amazon review dataset, with English as the source language and Japanese, German, and French as the target languages. The experimental results show that BiDRL outperforms the state-of-the-art methods for all the target languages.

Similar resources

Attention-based LSTM Network for Cross-Lingual Sentiment Classification

Most of the state-of-the-art sentiment classification methods are based on supervised learning algorithms which require large amounts of manually labeled data. However, the labeled resources are usually imbalanced in different languages. Cross-lingual sentiment classification tackles the problem by adapting the sentiment resources in a resource-rich language to resource-poor languages. In this ...

A Novel Two-Step Method for Cross Language Representation Learning

Cross language text classification is an important learning task in natural language processing. A critical challenge of cross language learning arises from the fact that words of different languages are in disjoint feature spaces. In this paper, we propose a two-step representation learning method to bridge the feature spaces of different languages by exploiting a set of parallel bilingual doc...

A Subspace Learning Framework for Cross-Lingual Sentiment Classification with Partial Parallel Data

Cross-lingual sentiment classification aims to automatically predict sentiment polarity (e.g., positive or negative) of data in a label-scarce target language by exploiting labeled data from a label-rich language. The fundamental challenge of cross-lingual learning stems from a lack of overlap between the feature spaces of the source language data and that of the target language data. To addres...

Semi-Supervised Representation Learning for Cross-Lingual Text Classification

Cross-lingual adaptation aims to learn a prediction model in a label-scarce target language by exploiting labeled data from a label-rich source language. An effective cross-lingual adaptation system can substantially reduce the manual annotation effort required in many natural language processing tasks. In this paper, we propose a new cross-lingual adaptation approach for document classification ...

Cross-lingual Sentiment Lexicon Learning With Bilingual Word Graph Label Propagation

In this article we address the task of cross-lingual sentiment lexicon learning, which aims to automatically generate sentiment lexicons for the target languages with available English sentiment lexicons. We formalize the task as a learning problem on a bilingual word graph, in which the intra-language relations among the words in the same language and the inter-language relations among the word...

Publication date: 2016